Exchangeability and regression models
Author
Abstract
Sir David Cox’s statistical career and his lifelong interest in the theory and application of stochastic processes began with problems in the wool industry. The problem of drafting a strand of wool yarn to near uniform width is not an auspicious starting point, but an impressive array of temporal and spectral methods from stationary time series was brought to bear on the problem in Cox (1949). His ability to extract the fundamental from the mundane became evident in his discovery or construction of the eponymous Cox process in the counting of neps in a sample of wool yarn (Cox 1955). Subsequent applications include hydrology and long-range dependence (Davison and Cox 1989; Cox 1991), models for rainfall (Cox and Isham 1988; Rodriguez-Iturbe, Cox and Isham 1987, 1988), and models for the spread of infectious diseases (Anderson, Cox and Hillier 1989). At some point in the late 1950s, the emphasis shifted to statistical models for dependence, the way in which a response variable depends on known explanatory variables or factors. Highlights include two books on the planning and analysis of experiments, seminal papers on binary regression, the Box-Cox transformation, and an oddly-titled paper on survival analysis. This brief summary is a gross simplification of Sir David’s work, but it suits my purpose by way of introduction because the chief goal of this chapter is to explore the relation between exchangeability, a concept from stochastic processes, and regression models in which the observed process is modulated by a covariate.
A stochastic process is a collection of random variables, $Y_1, Y_2, \ldots$, usually an infinite set though not necessarily an ordered sequence. That is, $U$ is an index set on which $Y$ is defined, and for each finite subset $S = \{u_1, \ldots, u_n\}$ of elements in $U$, the value $Y(S) = (Y(u_1), \ldots, Y(u_n))$ of the process on $S$ has distribution $P_S$ on $\mathbb{R}^S$.
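As a minimal sketch of this definition, the Python below builds a toy process and reads off its distribution on a finite subset S of the index set. The process Y(u) = A + B*u with A and B independent standard normal variables is an assumption chosen purely for illustration and does not come from the chapter; the function names `cov`, `distribution_on`, and `sample_on` are likewise hypothetical.

```python
import random

def cov(u, v):
    # Cov(Y(u), Y(v)) for the assumed toy process Y(u) = A + B*u,
    # where A and B are independent N(0, 1) variables (illustrative only).
    return 1.0 + u * v

def distribution_on(S):
    """Mean vector and covariance matrix of P_S, the Gaussian law of Y(S)."""
    S = list(S)
    mean = [0.0 for _ in S]
    covariance = [[cov(u, v) for v in S] for u in S]
    return mean, covariance

def sample_on(S, rng):
    """One realization of Y(S) = (Y(u_1), ..., Y(u_n)), stored as a function on S."""
    a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return {u: a + b * u for u in S}

# A finite subset S = {u_1, u_2, u_3} of the index set U (here the real line).
S = [0.0, 0.5, 2.0]
mean, covariance = distribution_on(S)
y = sample_on(S, random.Random(0))
```

The realization is returned as a dictionary keyed by the elements of S, so that an observation is a real-valued function on the sampled units rather than a vector whose only label is its length.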
This chapter is a little unconventional in that it emphasizes probability distributions rather than random variables. A real-valued process is thus a consistent assignment of probability distributions to observation spaces such that the distribution $P_n$ on $\mathbb{R}^n$ is the marginal distribution of $P_{n+1}$ on $\mathbb{R}^{n+1}$ under deletion of the relevant coordinate. A notation such as $\mathbb{R}^n$ that puts undue emphasis on the incidental, the dimension of the observation space, is not entirely satisfactory. Two samples of equal size need not have the same distribution, so we write $\mathbb{R}^S$ rather than $\mathbb{R}^n$ for the set of real-valued functions on the sampled units, and $P_S$ for the distribution. A process is said to be exchangeable if each finite-dimensional distribution is symmetric, or invariant under coordinate permutation. The definition suggests that exchangeability can have no role in statistical models for dependence, in which the distributions are overtly non-exchangeable on account of differences in covariate values. I argue that this narrow view is mistaken for two reasons. First, every regression model is a set of processes in which the distributions are indexed by the finite restrictions of the covariate, and regression exchangeability is defined naturally with that in mind. Second, regression
Similar resources
Exchangeable lower previsions
This paper deals with belief models for both finite and countable sequences of exchangeable random variables taking a finite number of values. When such sequences of random variables are assumed to be exchangeable, this more-or-less means that the specific order in which they are observed is deemed irrelevant. The first detailed study of exchangeability was made by de Finetti [5] (with the term...
Tractability through Exchangeability: A New Perspective on Efficient Probabilistic Inference
Exchangeability is a central notion in statistics and probability theory. The assumption that an infinite sequence of data points is exchangeable is at the core of Bayesian statistics. However, finite exchangeability as a statistical property that renders probabilistic inference tractable is less well-understood. We develop a theory of finite exchangeability and its relation to tractable probab...
Edge-exchangeable graphs and sparsity
A known failing of many popular random graph models is that the Aldous–Hoover Theorem guarantees these graphs are dense with probability one; that is, the number of edges grows quadratically with the number of nodes. This behavior is considered unrealistic in observed graphs. We define a notion of edge exchangeability for random graphs in contrast to the established notion of infinite exchangea...
Standard errors for regression on relational data with exchangeable errors
Relational arrays represent interactions or associations between pairs of actors, often over time or in varied contexts. We focus on the case where the elements of a relational array are modeled as a linear function of observable covariates. Due to the inherent dependencies among relations involving the same individual, standard regression methods for quantifying uncertainty in the regression c...
Tractability through Exchangeability: A New Perspective on Efficient Probabilistic Inference [Highlight on Published Work]
Exchangeability is a central notion in statistics and probability theory. The assumption that an infinite sequence of data points is exchangeable is at the core of Bayesian statistics. However, finite exchangeability as a statistical property that renders probabilistic inference tractable is less well-understood. We develop a theory of finite exchangeability and its relation to tractable probab...